Causal Inference, Hypothesis Testing, Z-scores
POLS 3316: Statistics for Political Scientists

Tom Hanna

2023-10-28

Where we were

  • Standard Errors - distance between sample and population data

  • Z-scores - probability that sample represents the true population data

              + Z- Score tables

Today

  • Look at Z-tables and formula
  • Talk about statistics and causal inference

Statistics: Cause and effect

  • What is a cause?
  • Causes are complicated!
  • The Fundamental Problem
  • Hypothesis Testing
  • So what does a hypothesis test tell us?
  • Bayes Rule Again

What is a cause?

What is a cause?

  • the thing that Y would not happen without

What is a cause?

  • the thing that Y would not happen without
  • If X does not exist, Y will not happen

What is a cause?

  • the thing that Y would not happen without
  • If X does not exist, Y will not happen

Except…

Causes are complicated!

Causes are complicated!

  • Multiple causes

Causes are complicated!

  • Multiple causes
  • Causes may be necessary but not sufficient (conditional)

Causes are complicated!

  • Multiple causes
  • Causes may be necessary but not sufficient (conditional)
  • Causes may be sufficient but not necessary (more than one possible sufficient cause)

Causes are complicated!

  • Multiple causes
  • Causes may be necessary but not sufficient (conditional)
  • Causes may be sufficient but not necessary (more than one possible cause)
  • Some causes may be neither

Causes are complicated!

  • Multiple causes
  • Causes may be necessary but not sufficient (conditional)
  • Causes may be sufficient but not necessary (more than one possible cause)
  • Some causes may be neither
  • The randomness factor (stochastic factor)

Causes are complicated!

  • Multiple causes
  • Causes may be necessary but not sufficient (conditional)
  • Causes may be sufficient but not necessary (more than one possible cause)
  • Some causes may be neither
  • The randomness factor (stochastic factor) - - Intermediate steps (mediating or moderating variables)

Causes are complicated!

  • Multiple causes
  • Causes may be necessary but not sufficient (conditional)
  • Causes may be sufficient but not necessary (more than one possible cause)
  • Some causes may be neither
  • The randomness factor (stochastic factor) - - Intermediate steps (mediating or moderating variables)
  • Reverse causation (endogeneity)

These are easy compared to the big issue…

The Fundamental Problem of Causal Inferance

The Fundamental Problem

  • We can’t observe “what if”?

The Fundamental Problem

  • We can’t observe the “what if”?
  • Technical term: counterfactual

The Fundamental Problem

  • We can’t observe the “what if”?
  • Technical term: counterfactual
  • We don’t know if Y would have happened without X because X did happen

The Fundamental Problem

  • We can’t observe the “what if”?
  • Technical term: counterfactual
  • We don’t know if Y would have happened without X because X happened
  • Huge problem for observational studies

The Fundamental Problem

  • We can’t observe the “what if”?
  • Technical term: counterfactual
  • We don’t know if Y would have happened without X because X happened
  • Huge problem for observational studies
  • Experimental design manipulates the data generation: partial solution

The Fundamental Problem

  • We can’t observe the “what if”?
  • Technical term: counterfactual
  • We don’t know if Y would have happened without X because X happened
  • Huge problem for observational studies
  • Experimental design manipulates the data generation: partial solution
  • Observational studies rely on our treatment of the data: partial solution

Approach hypothesis testing

  • Our approach to hypothesis testing is part of the solution to the fundamental problem
  • Our interpretation of hypothesis testing is driven by the fundamental problem

Hypothesis Testing

  • Null hypothesis: effect on Y is due to random chance

Hypothesis Testing

  • Null hypothesis: effect on Y is only due to random chance
  • Null hypothesis: as if X didn’t exist

Hypothesis Testing

  • Null hypothesis: effect on Y is only due to random chance
  • Null hypothesis: as if X didn’t exist
  • design the model so that: null hypothesis ~ the counterfactual

Hypothesis Testing

  • Null hypothesis: effect on Y is only due to random chance
  • Null hypothesis: as if X didn’t exist
  • design the model so that: null hypothesis ~ the counterfactual

That’s aspirational

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

  • Z < 1.96

So what does a hypothesis test tell us?

  • Z < 1.96: “the null hypothesis is retained”

So what does a hypothesis test tell us?

  • Z < 1.96: “the null hypothesis is retained”

      - The Theory is Wrong

So what does a hypothesis test tell us?

  • Z < 1.96: “the null hypothesis is retained”

      - The Theory is Wrong
      - As written

So what does a hypothesis test tell us?

  • Z < 1.96: “the null hypothesis is retained”

      - The Theory is Wrong
      - As written
      - In some way

So what does a hypothesis test tell us?

  • Possible: “the null hypothesis is retained”

      - The Theory is Wrong
      - As written
      - In some way
  • Z > 1.96:

So what does a hypothesis test tell us?

  • Possible: “the null hypothesis is retained”

      - The Theory is Wrong
      - As written
      - In some way
  • Z > 1.96: “the null hypothesis is rejected”

So what does a hypothesis test tell us?

  • Possible: “the null hypothesis is retained”

      - The Theory is Wrong
      - As written
      - In some way
  • Z > 1.96: “the null hypothesis is rejected”

      - The Theory is Right??

So what does a hypothesis test tell us?

  • Possible: “the null hypothesis is retained”

      - The Theory is Wrong
      - As written
      - In some way
  • Z > 1.96: “the null hypothesis is rejected”

      - The Theory is Right??

NO!!!!!!

So what does a hypothesis test tell us?

  • Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

So what does a hypothesis test tell us?

  • Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

The evidence is consistent with the theory.

So what does a hypothesis test tell us?

  • Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

The evidence is consistent with the theory.

The null hypothesis is rejected and the evidence is consistent with the hypothesized effect.

So what does a hypothesis test tell us?

  • Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

The evidence is consistent with the theory.

The null hypothesis is rejected and the evidence is consistent with the hypothesized effect.

What about certainty and proof?

Back to Bayes Rule

Bayes Rule

\[\\[6in]\]

This way madness lies

Everything after here is draft notes for your reading. Beware of typos, etc.

Some of the things in these notes are from courses I took, some are from assorted books, some are from these two sources which are at least somewhat readable and free:

https://egap.org/resource/10-things-to-know-about-hypothesis-testing/

https://egap.org/resource/10-things-to-know-about-causal-inference/

  • We need to understand what statistics can tell us about causation
  • Correlation doesn’t prove causation but that doesn’t mean statistics can’t help
  • We need to be precise about what we mean by a cause
  • We need to understand the limits of data and statistics
  • We need to understand the capabilities of data and statistics

\[\\[5in]\]

What do we know about causation?

  1. Correlation \(\notequal\) causation.

             + Correlation does imply a relationship that may involve some cause and effect somewhere.
             + The relationship could go either direction
             + The relationship could involve other variables
             + Lack of correlation doesn't necessarily mean anything -- correlation is linear and causal effects aren't always linear
  2. A cause is a claim about something that didn’t happen.

             + If we say X caused Y, we mean: *If X didn't happen, Y would not happen, everything else being held the same.*
  3. The Fundamental Problem of Causal Inference

             + Our proposed cause, which did happen, is the factual
             + The thing that didn't happen is called the *counterfactual*
             + We can't actually observe the thing that didn't happen
             + The inability to observe the counterfactual is the fundamental problem of causal inference
             + Experiments are a potential way around this
  4. Causes have to involve a possible manipulation of circumstances so that the counterfactual occurred

  5. Statistics looks for average causal effects, not single data points or individual effects. The average effects may conflict with anecdotal evidence. This is partially because…

  6. There can be multiple causes. (Causes are non-rival.)

  7. Causes can be necessary, sufficient, neither, or both and still be causes.

  8. It’s a lot easier to measure effects than to find causes.

What can statistics do for us?

  • The Null Hypothesis and counterfactuals

              + We can measure the probability an effect is due to random chance (the null hypothesis)
              + Formal hypothesis tests give us this value, the *p-value*
              + Theory provides an *alternative hypothesis* which we believe to be true based on the theory
              + Well designed hypotheses can help with the unobserved counterfactual
              + When we reject the null, we can determine that "the evidence is consistent with the alternative hypothesis" and the theory